We consider a noisy underdetermined system of linear equations in which a sparse vector (a vector
with few nonzero elements) is subject to measurement. The elements of the measurement matrix are drawn
from a Gaussian distribution. We study the information-theoretic constraints on exact support recovery of
a sparse vector from the measurement vector and matrix. The existing information-theoretic bounds and
conditions apply to strictly sparse signals. We compute a tight sufficient condition that applies to
both approximately and strictly sparse signals. Finally, we compare our results with the existing bounds
and recovery conditions.

I. Introduction

Underdetermined systems of linear equations arise in various applications. In general, they
have an infinite number of solutions. Recent studies show that, when the system is noise-free, such
a system has a unique sparse solution under certain conditions given in [1,2].

We consider a noisy system of linear equations for which it is known a priori that the solution is
$k$-sparse (a vector with $k$ nonzero elements),

$$Y = X\beta + W \qquad (1)$$

where $X \in \mathbb{R}^{n \times p}$ is a random Gaussian measurement matrix with independent and
identically distributed (i.i.d.) elements $X_{ij} \sim \mathcal{N}(0, 1)$, $\beta$ is a $k$-sparse vector
subject to measurement, and $W \sim \mathcal{N}(0, I_{n \times n})$ is a Gaussian noise vector. The
estimation of $\beta$ as a function of $X$ and $Y$ is an inverse problem that consists of (a) detecting
the support and (b) estimating the amplitudes of the nonzero elements. Once the support set of $\beta$
is determined, the optimal estimate of the sparse solution [3,4] is

$$\hat{\beta} = \arg\min_{\nu} \left\| Y - X_{\mathrm{supp}(\beta)}\,\nu \right\|_2^2 \qquad (2)$$

where

$$\mathrm{supp}(\beta) = \{ i : \beta_i \neq 0 \}$$

is the support set of $\beta$, and $X_{\mathrm{supp}(\beta)}$ is the $n \times k$ sub-matrix of the
measurement matrix with column indices in $\mathrm{supp}(\beta)$.
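The measurement model (1) and the support-restricted least-squares estimate (2) can be illustrated with a minimal NumPy sketch. The function names `simulate` and `oracle_least_squares`, the problem sizes, and the choice of equal-magnitude nonzero entries are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def simulate(n, p, k, lam, seed=0):
    """Draw one instance of Y = X beta + W with a k-sparse beta (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))                   # X_ij ~ N(0, 1), i.i.d.
    support = np.sort(rng.choice(p, size=k, replace=False))
    beta = np.zeros(p)
    beta[support] = lam * rng.choice([-1.0, 1.0], size=k)  # assumed: |beta_i| = lam on the support
    W = rng.standard_normal(n)                        # W ~ N(0, I_n)
    return X, X @ beta + W, beta, support

def oracle_least_squares(X, Y, support):
    """Least-squares estimate restricted to a known support set, as in (2)."""
    beta_hat = np.zeros(X.shape[1])
    coef, *_ = np.linalg.lstsq(X[:, support], Y, rcond=None)
    beta_hat[support] = coef
    return beta_hat

X, Y, beta, support = simulate(n=200, p=500, k=10, lam=1.0)
beta_hat = oracle_least_squares(X, Y, support)
```

The sketch assumes the support is already known; detecting the support is the part of the problem that the conditions in this paper address.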



In this paper, we are concerned with exact recovery of the solution support, which results in the
optimal estimate of the solution in (2) [3]. Specifically, we compute a sufficient condition that depends
on the number of measurements $n$, the vector dimension $p$, the sparsity level $k$, and the signal-to-noise
ratio

$$\mathrm{SNR} = \frac{E\{\|X\beta\|_2^2\}}{E\{\|W\|_2^2\}} = \|\beta\|_2^2 \qquad (3)$$

where the noise variance is $\sigma^2 = 1$ and the measurement matrix elements are drawn from a Gaussian
random source with zero mean and unit variance.

An ensemble of sparse vectors is characterized by

$$C_{p,k}(\lambda) := \{ \beta \in \mathbb{R}^p : |\mathrm{supp}(\beta)| = k,\ |\beta_i| \geq \lambda \}$$

where $\beta_i$ is a nonzero element of $\beta$. The optimal recovery of a signal in the ensemble
$C_{p,k}(\lambda)$ is not guaranteed by its SNR alone [4, 5]. We consider the case in which the decoder has
the highest failure probability, by taking $|\beta_i| = \lambda$ for $i \in \mathrm{supp}(\beta)$.
Therefore, the recovery of any sparse signal in $C_{p,k}(\lambda)$ with SNR $\geq k\lambda^2$ is
guaranteed [4-6].

In signal processing, natural random sources are widely modeled as ergodic wide-sense stationary
(EWSS) [7, 8]. In [9, 10] the authors model the support of a sparse signal as a vector of random elements,
in which each element is the outcome of a random source and is nonzero with probability $k/p$. In [11, 12]
the authors model the high-dimensional sparse vector $\beta$ as a realization of an ergodic stationary
source. In [5] Wang et al. compute the first- and second-order statistics of $\beta$, which implies that
the random vector source is assumed to be at least wide-sense stationary. In the following, we assume an
ergodic wide-sense stationary source for the random sparse vector $\beta$.

Practical compressible vectors are approximately sparse ($k$ large elements, with the rest small
but nonzero) [13]. It also may not be possible to obtain the sparsity level $k$ explicitly for such
vectors. In some applications, such as blind deconvolution, the sparse signal is filtered, and the existing
bounds (which are based on the sparsity level) do not apply [14]. To overcome this issue, we exploit the
minimum eigenvalue of the signal autocorrelation matrix instead of the sparsity level $k$ (see sections II
and III).

A. Previous Work 

There is a large body of literature [3-6, 15-18] on the support recovery of sparse signals with
random measurement matrices. Various techniques have been used therein to compute the recovery
constraints. In [16] a necessary condition is given using Shannon's source-channel separation theorem,

log (?) 

In [4-6, 15, 17] Fano's inequality and the Chernoff bound are used. In [3] Shannon theory, a joint
typicality decoder, and Fano's inequality are used to compute the information-theoretic constraints. In
this work, we are concerned with necessary and sufficient conditions that depend on $(n, p, k, \lambda)$,
and we use the Fano and Brunn-Minkowski inequalities [19].

In section II we present our results and compare them with the existing bounds and conditions. In section
III the proofs of the results in section II are given. Finally, in section IV we conclude our work.

II. Results 

Theorem 1: Assume a measurement matrix $X \in \mathbb{R}^{n \times p}$ whose elements are drawn from
an i.i.d. Gaussian random source with zero mean and unit variance, i.e., $X_{ij} \sim \mathcal{N}(0, 1)$.
A sufficient condition for asymptotically reliable recovery over the signal class $C_{p,k}(\lambda)$ is

$$\log\left( \binom{p-k+m}{m} - 1 \right) - \log(2) < L(m, k, \lambda, n, p) \qquad (5)$$

for $m = 1, \ldots, k$, where

$$L(m, k, \lambda, n, p) = \frac{n}{2} \log\left[ 1 + \exp\left( F/n + G \right) \right], \qquad (6)$$

$$F = -n\gamma + \sum_{j=1}^{n} \sum_{l=1}^{p-k+m-j} \frac{1}{l},$$

$\gamma$ is Euler's constant, and

$$G = \log\left( \frac{m}{p-k+m}\lambda^2 - \left( \frac{m}{p-k+m} \right)^2 \lambda^2 \right).$$

The proof of Theorem 1 exploits Fano's inequality in the regime where the support error probability
tends to zero. The parameter $G$ in Theorem 1 is the logarithm of the minimum eigenvalue of the signal
autocorrelation matrix (see section III). This may not be tight for asymptotic values of the sparsity
level $k$. Therefore, we replace $G$ in Theorem 1 for the cases $k = o(p)$ and $k = \Theta(p)$.
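As a rough illustration of how the condition in Theorem 1 could be evaluated numerically, the sketch below computes $F$, $G$, and $L$ and tests (5) for every $m = 1, \ldots, k$. It assumes natural logarithms, $n \leq p - k + m$, and the expressions for $F$ and $G$ as read here from the theorem statement; all function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def log_binom(a, b):
    """Natural log of the binomial coefficient C(a, b)."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)

def f_term(n, d):
    """F = -n*gamma + sum_{j=1..n} sum_{l=1..d-j} 1/l, with d = p - k + m."""
    if n > d:
        raise ValueError("this sketch assumes n <= p - k + m")
    harmonic = [0.0]                      # harmonic[t] = sum_{l=1..t} 1/l
    for l in range(1, d):
        harmonic.append(harmonic[-1] + 1.0 / l)
    return -n * EULER_GAMMA + sum(harmonic[d - j] for j in range(1, n + 1))

def g_term(m, d, lam):
    """G of Theorem 1 as read here: log(rho*lam^2 - (rho*lam)^2), rho = m/d."""
    rho = m / d
    return math.log(rho * lam**2 - (rho * lam) ** 2)

def l_bound(m, k, lam, n, p):
    """L(m, k, lam, n, p) = (n/2) * log(1 + exp(F/n + G)), as in (6)."""
    d = p - k + m
    return 0.5 * n * math.log1p(math.exp(f_term(n, d) / n + g_term(m, d, lam)))

def sufficient_condition_holds(n, p, k, lam):
    """Check inequality (5) for every m = 1..k."""
    for m in range(1, k + 1):
        lb = log_binom(p - k + m, m)
        lhs = lb + math.log1p(-math.exp(-lb)) - math.log(2.0)  # log(C - 1) - log 2
        if not lhs < l_bound(m, k, lam, n, p):
            return False
    return True
```

Scanning `n` upward for fixed `(p, k, lam)` until `sufficient_condition_holds` first returns `True` gives one point of a curve in the sparsity-ratio versus measurement-ratio plane.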

Corollary 1: Assume $k = o(p)$. The sufficient condition for asymptotically reliable recovery over
the signal class $C_{p,k}(\lambda)$ is obtained from Theorem 1 by using

$$G = \log\left( \frac{m\lambda^2}{p-k+m} \right) \qquad (7)$$

in (6).

Corollary 2: Assume $k = \Theta(p)$. The sufficient condition for asymptotically reliable recovery
over the signal class $C_{p,k}(\lambda)$ is obtained from Theorem 1 by using

$$G = \frac{1}{2\pi} \int_{0}^{2\pi} \log S(\omega)\, d\omega \qquad (8)$$

in (6), where $S(\omega)$ is the power spectrum of the sparse signal.

Theorem 2 [5]: Assume a measurement matrix $X \in \mathbb{R}^{n \times p}$ whose elements are drawn
from an i.i.d. Gaussian source with zero mean and unit variance. A necessary condition for asymptotically
reliable recovery over the signal class $C_{p,k}(\lambda)$ is

$$n > \max\{ f_1(p, k, \lambda), \ldots, f_k(p, k, \lambda), k \} \qquad (9)$$

where

$$f_m(p, k, \lambda) = \frac{\log\binom{p-k+m}{m} - 1}{\frac{1}{2}\log\left( 1 + m\lambda^2 \left( 1 - \frac{m}{p-k+m} \right) \right)} \qquad (10)$$

for $m = 1, \ldots, k$.
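The necessary condition (9)-(10) is straightforward to evaluate. The following sketch assumes natural logarithms and $\lambda > 0$; the function names are hypothetical and chosen only for this example.

```python
import math

def log_binom(a, b):
    """Natural log of the binomial coefficient C(a, b)."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)

def f_m(m, p, k, lam):
    """f_m(p, k, lam) from (10)."""
    d = p - k + m
    numerator = log_binom(d, m) - 1.0
    denominator = 0.5 * math.log(1.0 + m * lam**2 * (1.0 - m / d))
    return numerator / denominator

def necessary_measurements(p, k, lam):
    """Right-hand side of (9): max{f_1, ..., f_k, k}."""
    return max(max(f_m(m, p, k, lam) for m in range(1, k + 1)), k)
```

Any number of measurements `n` at or below `necessary_measurements(p, k, lam)` violates the necessary condition of Theorem 2.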

In Fig. 1 we compare the proposed sufficient condition in Theorem 1 with the existing bounds in
[5, 16]. The horizontal axis is the sparsity ratio $\alpha = k/p$ and the vertical axis is the measurement
ratio $n/p$. The necessary condition (dashed line) is computed by simulation from (12) for unipolar signals
(see section III), where the error probability tends to zero. We observe that the sufficient condition in
Theorem 1 is tighter than the necessary conditions in [5, 16] (Eq. (4) and Theorem 2) with respect to the
simulation results (dashed line). The authors of [4] claim that their sufficient condition is
asymptotically as tight as the necessary condition in [5]. Thus, we can conclude that Theorem 1 is also
tighter than the sufficient condition in [4].
Fig. 1. Comparison of the necessary and sufficient conditions.



III. Proofs 

A. Theorem 1 

To obtain the exact support recovery conditions on the number of measurements, we exploit Fano's
inequality

$$P_{err} \geq 1 - \frac{I(\theta; Y) + \log 2}{\log\left( \binom{p}{k} - 1 \right)} \qquad (11)$$

where $I(\theta; Y)$ is the mutual information between the support $\theta$ and $Y$ [5, 6, 15, 19],
$\hat{\theta}$ is the detected support set given by the decoder, $\hat{\theta} = \mathcal{D}(Y)$, and
$P_{err}$ is the probability that the decoder fails to detect the correct support set, i.e.,

$$P_{err} = \Pr[\mathrm{supp}(\beta) \neq \hat{\theta}].$$
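The error probability of the decoder, $P_{err}(\mathcal{D})$, is the average error probability with
respect to the Gaussian measurement matrix $X$, i.e.,

$$P_{err}(\mathcal{D}) = E_X P_{err}$$

where $E_X$ denotes the expected value with respect to $X$ [3, 5]. We further assume that the
decoder has a priori knowledge of all nonzero locations of $\beta$ except the $m$ locations with the
smallest values ($1 \leq m \leq k$). The decoder then has to choose from $\binom{p-k+m}{m}$ support sets
[5]. Let $U$ be the set of unknown location indices, where $|U| = m$. The $n$-dimensional observation
vector is

$$Y = \tilde{X}\tilde{\beta} + W$$

where $\tilde{X}$ is the sub-matrix of the measurement matrix and $\tilde{\beta}$ is the sub-vector of the
vector subject to measurement that correspond to the $p - k + m$ locations not known to the decoder; the
$m$ nonzero elements of $\tilde{\beta}$ are indexed by $U$. Therefore, the error probability of the
decoder is bounded as

$$P_{err}(\mathcal{D}) \geq 1 - \frac{E_X I(\theta; Y) + \log 2}{\log\left( \binom{p-k+m}{m} - 1 \right)}. \qquad (12)$$

The mutual information $I(\theta; Y)$ is given by

$$I(\theta; Y) = H(Y|\tilde{X}) - H(Y|\theta, \tilde{X}) = H(Y|\tilde{X}) - H(W) = \frac{1}{2} \log\left| I_n + \tilde{X} R_{\tilde{\beta}} \tilde{X}^{\top} \right| \qquad (13)$$

where $R_{\tilde{\beta}}$ is the autocorrelation matrix of the random vector $\tilde{\beta}$. The equality
$|I_p + AB| = |I_n + BA|$ holds for any pair of matrices $A_{p \times n}$ and $B_{n \times p}$ [20].
Therefore, (13) can be rewritten as

$$I(\theta; Y) = \frac{1}{2} \log\left| I_{p-k+m} + \tilde{X}^{\top}\tilde{X} R_{\tilde{\beta}} \right|. \qquad (14)$$

We have $\mathrm{rank}(\tilde{X}^{\top}\tilde{X} R_{\tilde{\beta}}) \leq \min\{\mathrm{rank}(\tilde{X}^{\top}\tilde{X}), \mathrm{rank}(R_{\tilde{\beta}})\}$.
In [21] it is shown that $\mathrm{rank}(\tilde{X}^{\top}\tilde{X}) = n$, which implies
$\min\{\mathrm{rank}(\tilde{X}^{\top}\tilde{X}), \mathrm{rank}(R_{\tilde{\beta}})\} = n$.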
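The determinant identity quoted from [20], which moves the log-determinant from an $n \times n$ matrix to a $(p-k+m) \times (p-k+m)$ matrix, can be checked numerically. In this sketch the sizes `n` and `d` (standing in for $p-k+m$) and the randomly generated positive semidefinite matrix `R` (standing in for the autocorrelation matrix) are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 7                          # d plays the role of p - k + m
X = rng.standard_normal((n, d))      # stand-in for the restricted measurement matrix
B = rng.standard_normal((d, d))
R = B @ B.T / d                      # symmetric PSD stand-in for the autocorrelation matrix

# log|I_n + X R X^T| versus log|I_d + X^T X R|
sign_n, logdet_n = np.linalg.slogdet(np.eye(n) + X @ R @ X.T)
sign_d, logdet_d = np.linalg.slogdet(np.eye(d) + X.T @ X @ R)
```

Both log-determinants agree up to floating-point error, which is the step that takes the mutual information from the $n$-dimensional form to the $(p-k+m)$-dimensional form.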

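We obtain a lower bound on the mutual information by applying the Brunn-Minkowski inequality [19],

$$I(\theta; Y) \geq \frac{n}{2} \log\left( 1 + \left| \tilde{X}^{\top}\tilde{X} R_{\tilde{\beta}} \right|^{1/n} \right) = \frac{n}{2} \log\left( 1 + \exp\left( \frac{1}{n} \log\left| \tilde{X}^{\top}\tilde{X} R_{\tilde{\beta}} \right| \right) \right). \qquad (15)$$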
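The determinant inequality behind (15) can be sanity-checked on random instances. The sketch below works with the $n \times n$ matrix $A = X R X^{\top}$, an assumption made here so that the determinant is nonzero, and compares $\frac{1}{2}\log|I_n + A|$ with $\frac{n}{2}\log(1 + |A|^{1/n})$. The helper name and the sizes are hypothetical.

```python
import numpy as np

def exact_and_bound(A):
    """Return 0.5*log|I + A| and 0.5*n*log(1 + |A|^(1/n)) for an n x n PSD matrix A."""
    n = A.shape[0]
    _, logdet_full = np.linalg.slogdet(np.eye(n) + A)
    _, logdet_a = np.linalg.slogdet(A)
    exact = 0.5 * logdet_full
    bound = 0.5 * n * np.log1p(np.exp(logdet_a / n))
    return exact, bound

rng = np.random.default_rng(2)
n, d = 5, 12
results = []
for _ in range(20):
    X = rng.standard_normal((n, d))
    B = rng.standard_normal((d, d))
    R = B @ B.T / d                  # random PSD stand-in for the autocorrelation matrix
    results.append(exact_and_bound(X @ R @ X.T))
```

In every trial the first value is at least as large as the second, in line with the direction of the inequality in (15).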



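Now, by taking the expectation in (15) and using Jensen's inequality, we can write

$$E_X I(\theta; Y) \geq \frac{n}{2} \log\left[ 1 + \exp\left( \frac{1}{n}(a + b) \right) \right] \qquad (16)$$

where

$$a = E_X \sum_{i=1}^{n} \log \sigma_i\left( \tilde{X}^{\top}\tilde{X} \right) \qquad (17)$$

and

$$b = \sum_{i=1}^{n} \log \sigma_i\left( R_{\tilde{\beta}} \right). \qquad (18)$$

Here $\sigma_i(\cdot)$ is the $i$th eigenvalue of a matrix. In (18), the eigenvalues are sorted as
$\sigma_1(R_{\tilde{\beta}}) \leq \sigma_2(R_{\tilde{\beta}}) \leq \cdots \leq \sigma_{p-k+m}(R_{\tilde{\beta}})$.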

Eq. (17) is the expected value of the logarithm of the determinant of a random Wishart matrix. The
product of the eigenvalues of this random Wishart matrix is distributed as the product of $n$ chi-square
distributed random variables [22, 23], and therefore we have

$$\prod_{i=1}^{n} \sigma_i\left( \tilde{X}^{\top}\tilde{X} \right) \sim \prod_{j=1}^{n} \chi^2_{p-k+m-j+1}. \qquad (19)$$

By taking the logarithm and then the expectation of the right-hand side of (19), we obtain

$$F = E \sum_{j=1}^{n} \log \chi^2_{p-k+m-j+1} = -n\gamma + \sum_{j=1}^{n} \sum_{l=1}^{p-k+m-j} \frac{1}{l} \qquad (20)$$

where $\gamma \approx 0.577215664$ is Euler's constant [22, 23].

We model the elements of the sparse vector as the output of an ergodic wide-sense stationary random
vector source. Therefore, the elements of such a random vector are also ergodic wide-sense stationary. The
nonzero elements appear with probability $\Pr[\tilde{\beta}_i \neq 0] = \frac{m}{p-k+m}$ [9, 10, 24]. The
autocorrelation matrix of $\tilde{\beta}$ is a Hermitian Toeplitz matrix that is obtained from its
autocorrelation function [7]. We assume that the nonzero elements $\tilde{\beta}_i$, where $i \in U$, are
negative with probability $\xi$, i.e., $\xi = \Pr[\tilde{\beta}_i \in \mathbb{R}^{-}]$. The
autocorrelation function at lag $\tau = 0$ is

$$r_{\tilde{\beta}}(0) = \frac{1}{p-k+m} \sum_{i} \tilde{\beta}_i \tilde{\beta}_i = \frac{m}{p-k+m}\lambda^2.$$

For lag $\tau \neq 0$ and $|\tau| < p$, the autocorrelation function is

$$r_{\tilde{\beta}}(\tau) = \frac{1}{p-k+m} \sum_{i} \tilde{\beta}_i \tilde{\beta}_{i+\tau} = \frac{p-k+m-|\tau|}{p-k+m} \left( \frac{m}{p-k+m} \right)^2 \lambda^2 \times \left[ \xi^2 + (1-\xi)^2 - 2\xi(1-\xi) \right].$$
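For $|\tau| \geq p$ the autocorrelation function is $r_{\tilde{\beta}}(\tau) = 0$. Therefore, with the
factor $(p-k+m-|\tau|)/(p-k+m)$ approximated by one, the autocorrelation function is

$$r_{\tilde{\beta}}(\tau) = \begin{cases} \left( \frac{m}{p-k+m} \right)^2 (4\xi^2 - 4\xi + 1)\lambda^2 & \text{if } |\tau| < p,\ \tau \neq 0, \\ \frac{m}{p-k+m}\lambda^2 & \text{if } \tau = 0, \\ 0 & \text{otherwise.} \end{cases}$$

The autocorrelation matrix of $\tilde{\beta}$ is a Hermitian Toeplitz matrix with $r_{\tilde{\beta}}(\tau)$
in its first row, i.e.,

$$R_{\tilde{\beta}} = \mathrm{Toeplitz}\{ r_{\tilde{\beta}} \}.$$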
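We use $R_{\tilde{\beta}}$ and $r_{\tilde{\beta}}$ to find a lower bound on (18),

$$n \log \sigma_{min} \leq \sum_{i=1}^{n} \log \sigma_i\left( R_{\tilde{\beta}} \right),$$

where $\sigma_{min} \leq \sigma_i(R_{\tilde{\beta}})$ is a lower bound on all eigenvalues of
$R_{\tilde{\beta}}$. The minimum eigenvalue of $R_{\tilde{\beta}}$ is lower bounded by the infimum of the
power spectrum [7],

$$\sigma_{min} = \min_{\omega} S_{\tilde{\beta}}(\omega) \qquad (21)$$

where

$$S_{\tilde{\beta}}(\omega) = \frac{\sin(\omega(p + 1/2))}{\sin(\omega/2)} \left( \frac{m}{p-k+m}\lambda \right)^2 (4\xi^2 - 4\xi + 1) + \frac{m}{p-k+m}\lambda^2 - \left( \frac{m}{p-k+m}\lambda \right)^2 (4\xi^2 - 4\xi + 1) \qquad (22)$$

is the power spectrum of $\tilde{\beta}$. It is computed by taking the Fourier transform of
$r_{\tilde{\beta}}$ [7]. Therefore, we obtain

$$\sigma_{min}(\xi) = \frac{m}{p-k+m}\lambda^2 - \left( \frac{m}{p-k+m}\lambda \right)^2 (4\xi^2 - 4\xi + 1) \qquad (23)$$

from (22) and (21). $\sigma_{min}$ is a function of $\xi$. The term $0 \leq 4\xi^2 - 4\xi + 1 \leq 1$
reaches its maximum when $\xi \in \{0, 1\}$, i.e., when the sparse vector subject to measurement is
unipolar. To lower bound $\sigma_{min}$ we choose $\xi \in \{0, 1\}$. Finally, by substituting
$\sigma_i(R_{\tilde{\beta}}) = \sigma_{min}(0)$ (the minimum eigenvalue of $R_{\tilde{\beta}}$), (20), and
(16) in (12), we obtain (5).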
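The closed form (23) can be compared with the numerically computed minimum eigenvalue of the autocorrelation matrix. The sketch below assumes that the autocorrelation takes one constant value at every nonzero lag (that is, the lag-dependent factor is taken to be one), so the Toeplitz matrix has a constant off-diagonal; the function names and the parameter values are hypothetical.

```python
import numpy as np

def autocorrelation_matrix(d, m, lam, xi):
    """Toeplitz matrix with r(0) on the diagonal and a constant r(tau != 0) elsewhere."""
    rho = m / d                                          # probability of a nonzero element
    r0 = rho * lam**2                                    # autocorrelation at lag 0
    r_off = (rho * lam) ** 2 * (4 * xi**2 - 4 * xi + 1)  # assumed constant for all nonzero lags
    lags = np.abs(np.subtract.outer(np.arange(d), np.arange(d)))
    return np.where(lags == 0, r0, r_off)

def sigma_min_closed_form(d, m, lam, xi):
    """sigma_min(xi) as in (23), with d = p - k + m."""
    rho = m / d
    return rho * lam**2 - (rho * lam) ** 2 * (4 * xi**2 - 4 * xi + 1)

R = autocorrelation_matrix(d=50, m=5, lam=1.5, xi=0.0)   # unipolar case
numeric = np.linalg.eigvalsh(R).min()
closed = sigma_min_closed_form(d=50, m=5, lam=1.5, xi=0.0)
```

Under this assumption the matrix is a scaled identity plus a nonnegative rank-one term, so its smallest eigenvalue equals the diagonal value minus the off-diagonal value, which is the expression in (23).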

B. Corollaries 

First, we consider $k = o(p)$. If $k/p \to 0$, we have $\frac{m}{p-k+m} \to 0$ for $m = 1, \ldots, k$.
This results in $r_{\tilde{\beta}}(\tau) \to 0$ for $\tau \neq 0$, $|\tau| < p$. Therefore, the
autocorrelation matrix $R_{\tilde{\beta}}$ becomes a diagonal matrix with elements equal to
$\sigma_{min} = \frac{m\lambda^2}{p-k+m}$. Note that a constant SNR implies that $m\lambda^2$ is constant.
By substituting in (18) and (16), we obtain

$$G = \log\left( \frac{m\lambda^2}{p-k+m} \right).$$
Second, we consider $k = \Theta(p)$. In (18) the eigenvalues of the autocorrelation matrix are sorted;
they are the $n$ smallest eigenvalues out of $p$. To tighten the condition in Theorem 1, we use a lower
bound from [25] on the product of the $n$ smallest eigenvalues of the matrix, instead of lower bounding
(18) by replacing every eigenvalue with the minimum eigenvalue. Therefore, we lower bound (18) as



where the second term on the right-hand side is the integral of the logarithm of the signal power spectrum
[25, 26]. Noting that the inequalities $k < p$ and $n > k$ hold, $k = \Theta(p)$ implies that



IV. Conclusion

In this paper we considered ergodic wide-sense stationary sparse signals. Ergodicity and wide-sense
stationarity are widely used in signal processing to model random information sources. This is
of great importance for computing bounds when the signal is approximately sparse. In such practical
situations, the sufficient condition in Theorem 1 is obtained simply by finding the infimum of the signal
power spectrum and substituting its logarithm for $G$ in (6) (see section III). We showed that the
sufficient condition in Theorem 1 is tight when compared with the simulation results. Moreover, we
tightened Theorem 1 for asymptotic values of the sparsity level.